# DPO Fine-tuning

## L3.3 GeneticLemonade Unleashed V3 70B
**Author:** zerofata · **Tags:** Large Language Model, Transformers · **Downloads:** 119 · **Likes:** 5

A 70B-parameter large language model based on Llama 3.3, fine-tuned with SFT followed by DPO. It specializes in character-driven dialogue and creative content generation.
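Several entries on this page follow the same SFT-then-DPO recipe. As a rough illustration of what the DPO stage looks like in practice, here is a minimal sketch using Hugging Face's TRL library; the stand-in model ID, toy dataset, and hyperparameters are assumptions for illustration, not the actual recipe behind any model listed here.

```python
# Minimal DPO fine-tuning sketch with TRL (illustrative only; the model,
# dataset, and hyperparameters are placeholders).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Llama-3.2-1B-Instruct"  # small stand-in model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# DPO trains on preference pairs: a prompt plus a preferred ("chosen")
# and a dispreferred ("rejected") completion.
train_dataset = Dataset.from_dict({
    "prompt": ["Describe the character Alice in one sentence."],
    "chosen": ["Alice is a sharp-tongued detective who hides a soft heart."],
    "rejected": ["Alice is a character."],
})

args = DPOConfig(
    output_dir="dpo-out",
    beta=0.1,  # strength of the implicit KL penalty toward the reference model
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

# With ref_model=None, TRL snapshots a frozen copy of `model` to serve as
# the reference policy. (`processing_class` is the recent-TRL name for the
# tokenizer argument.)
trainer = DPOTrainer(
    model=model,
    ref_model=None,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```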
## Qwen2.5 14B Dpo It Ties
**Author:** mergekit-community · **Tags:** Large Language Model, Transformers · **Downloads:** 30 · **Likes:** 2

An enhanced variant of Qwen2.5-14B produced by fusing models with the TIES merge method, focused on instruction following and dialogue quality.
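TIES merges like the one above are typically built with the mergekit toolkit. The sketch below writes an illustrative config and invokes mergekit's `mergekit-yaml` CLI; the source models, densities, and weights are assumptions for illustration, not this merge's actual configuration.

```python
# Illustrative TIES merge via mergekit; all model names and parameter
# values below are placeholders, not the recipe behind the model above.
import pathlib
import subprocess

config = """\
merge_method: ties
base_model: Qwen/Qwen2.5-14B
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 0.5   # fraction of delta weights kept before sign election
      weight: 0.5    # mixing weight for this model's task vector
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 0.5
      weight: 0.5
parameters:
  normalize: true
dtype: bfloat16
"""

pathlib.Path("ties-config.yml").write_text(config)
# mergekit-yaml <config> <output_dir> materializes the merged checkpoint.
subprocess.run(["mergekit-yaml", "ties-config.yml", "./merged-model"], check=True)
```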
## Tanuki 8B Dpo V1.0
**Author:** weblab-GENIAC · **License:** Apache-2.0 · **Tags:** Large Language Model, Transformers, Supports Multiple Languages · **Downloads:** 1,143 · **Likes:** 41

Tanuki-8B is an 8B-parameter Japanese large language model developed by the GENIAC Matsuo Lab, optimized for dialogue tasks through SFT and DPO.
## Flammen21 Mistral 7B
**Author:** flammenai · **License:** Apache-2.0 · **Tags:** Large Language Model, Transformers · **Downloads:** 23 · **Likes:** 1

A merge of pre-trained Mistral 7B models, further fine-tuned with DPO on the Date-DPO-v2 dataset. It excels at role-play, creative writing, and general assistant tasks.
## Starchat2 15b V0.1
**Author:** HuggingFaceH4 · **Tags:** Large Language Model, Transformers · **Downloads:** 4,196 · **Likes:** 111

StarChat2 is a 16-billion-parameter programming assistant fine-tuned from StarCoder2, strong at both conversation and code generation.
## Zephyr 7b Gemma V0.1
**Author:** HuggingFaceH4 · **License:** Other · **Tags:** Large Language Model, Transformers · **Downloads:** 502 · **Likes:** 124

Zephyr 7B Gemma is a language model fine-tuned from google/gemma-7b on publicly available synthetic datasets using Direct Preference Optimization (DPO), designed to serve as a helpful assistant.
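DPO-aligned assistants like Zephyr ship with a chat template, so they can be queried through the standard transformers `pipeline` API. A minimal sketch follows; the prompt and sampling settings are arbitrary examples, not recommended values.

```python
# Minimal chat inference sketch for a DPO-tuned assistant; sampling
# settings here are arbitrary examples.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-gemma-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain DPO in two sentences."}]

# The pipeline applies the model's chat template to the message list and
# returns the conversation with the new assistant turn appended.
out = pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.7)
print(out[0]["generated_text"][-1]["content"])
```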
## Eeve Dpo V3
**Author:** ENERGY-DRINK-LOVE · **License:** Apache-2.0 · **Tags:** Large Language Model, Transformers · **Downloads:** 1,803 · **Likes:** 1

A Korean instruction-following model based on EEVE-Korean-Instruct-10.8B-v1.0, trained with Direct Preference Optimization (DPO).
## Minueza 32M Chat
**Author:** Felladrin · **License:** Apache-2.0 · **Tags:** Large Language Model, Transformers, English · **Downloads:** 77 · **Likes:** 9

Minueza-32M-Chat is a 32M-parameter chat model based on Felladrin/Minueza-32M-Base, trained with supervised fine-tuning (SFT) and direct preference optimization (DPO).
## Olmo 7B Instruct
**Author:** allenai · **License:** Apache-2.0 · **Tags:** Large Language Model, Transformers, English · **Downloads:** 365 · **Likes:** 53

OLMo 7B Instruct is an open language model trained on the Dolma dataset and tuned with SFT and DPO, designed for question-answering tasks.
## Causallm 14B DPO Alpha GGUF
**Author:** tastypear · **Tags:** Large Language Model, Supports Multiple Languages · **Downloads:** 2,238 · **Likes:** 85

A 14B-parameter causal language model optimized with DPO and distributed in GGUF format, supporting English and Chinese text generation.
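GGUF builds such as this one target llama.cpp-compatible runtimes rather than transformers. Here is a minimal sketch using llama-cpp-python; the model file name below is a placeholder for whichever quantization level you actually download.

```python
# Minimal GGUF inference sketch with llama-cpp-python; the model file
# name is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./causallm-14b-dpo-alpha.Q4_K_M.gguf",  # placeholder file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to the GPU when available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "用一句话介绍你自己。"}],  # "Introduce yourself in one sentence."
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```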
## 14B
**Author:** CausalLM · **Tags:** Large Language Model, Transformers, Supports Multiple Languages · **Downloads:** 236 · **Likes:** 303

A 14B-parameter causal language model fully compatible with the Meta LLaMA 2 architecture; its authors report that it outperforms all sub-70B models on multiple benchmarks.